Accelerated evolution of SARS-CoV-2 in free-ranging white-tailed deer

The zoonotic origin of the COVID-19 pandemic virus highlights the need to fill the vast gaps in our knowledge of SARS-CoV-2 ecology and evolution in non-human hosts. Here, we detected that SARS-CoV-2 was introduced from humans into white-tailed deer more than 30 times in Ohio, USA during November 2021-March 2022. Subsequently, deer-to-deer transmission persisted for 2–8 months, disseminating across hundreds of kilometers. Newly developed Bayesian phylogenetic methods quantified how SARS-CoV-2 evolution is not only three-times faster in white-tailed deer compared to the rate observed in humans but also driven by different mutational biases and selection pressures. The long-term effect of this accelerated evolutionary rate remains to be seen as no critical phenotypic changes were observed in our animal models using white-tailed deer origin viruses. Still, SARS-CoV-2 has transmitted in white-tailed deer populations for a relatively short duration, and the risk of future changes may have serious consequences for humans and livestock.

More than 10% of the samples collected from WTD for this study were positive by rRT-PCR for SARS-CoV-2 (163/1522, 10.7%, 95% CI 9.2 -12.4%, Table S1). In more than half of the counties sampled (59.0%, 49/83), at least one SARS-CoV-2 positive WTD was identi ed (Figure 1, Table S2). Positive samples were retrieved from Ohio's three largest metropolitan areas surrounding Cleveland, Columbus, and Cincinnati, as well as from rural areas ( Figure 1). The only region with no positive samples was the rural northwest, where sample collection was limited. There was no signi cant difference in the percentage of urban counties with at least one positive sample (68.4%, 13/19) compared to rural counties (56.2%, 36/64) (Chi2 0.90, p = 0.34). Accounting for county level clustering, male WTD had increased odds of infection compared to females (OR = 1.5, p = 0.034, Table S3), but the age of the WTD had no signi cant effect. Samples collected in December had 1.8x higher odds of infection compared to November, when most samples were collected ( Figure S1, Table S3). Odds of infection were even lower in January, and no samples from February or March 2022 tested positive. Hunter-harvested WTD had similar odds of infection in urban and rural counties (Table S3). In contrast, culled WTD in rural counties had higher odds of infection compared to WTD culled in urban counties (OR=14.7, p=0.006, Tables 1 & S3). Only 0.6% of WTD culled from urban counties were positive by PCR, but 22.2% of blood samples from the same WTD were seropositive for antibodies (Table S3b), indicating previous exposure in a population with very few active infections. Overall, the estimated seroprevalence of SARS-CoV-2 in Ohio WTD was 23.5% (274/1164, 95% CI 21.1 -26.1%, Table S1). At least one seropositive WTD was sampled in 66.3% (55/83) of counties. Even in an urban county like Franklin (which includes Columbus), where all 149 samples from culled WTD were negative by PCR, estimated seroprevalence was 19%. Thus, urban culled WTD with few infections within our sampling time interval still had similar likelihood of previous exposure to other WTD tested for antibodies. Almost 70% (804/1164) of WTD were negative by both PCR and surrogate virus neutralization test (sVNT), indicating that a large portion of the WTD population in Ohio remained naïve to SARS-CoV-2. Fourteen counties were negative by both PCR and serology, all of which had 10 or fewer serum samples (Table S2). Given that seroprevalence was similar in male and female WTD, in urban and rural counties, and across months, we found no signi cant predictors for seropositivity.
Whole-genome SARS-CoV-2 sequences were obtained from 34 of the 49 counties with at least one positive WTD (69.4%), for a total of 80 sequences ( Figure 1, Table S4). Nearly 70% of these sequences were from rural counties, corresponding closely to the distribution of PCR positive samples (74% from rural counties, Table S1). Ninety-ve percent of sequenced WTD viruses were sampled when the delta variant was dominant in humans, prior to December 15, 2021 ( Figure 2, Table S4).
Alpha variants persist in Ohio white-tailed deer.
The vast majority of the sequenced WTD viruses (88.8%, 71/80) belonged to the delta variant (B.1.617.2 and AY PANGO lineages), matching the dominant variant circulating in humans at the time (Table S4). The remaining nine WTD viruses belonged to the B.1.1.7 lineage (alpha variant) that circulated in humans in Ohio six months earlier, during the spring of 2021 (Table S4). The nine B.1.1.7 viruses were collected from four counties in southern Ohio (Brown, Highland, Meigs, Pickaway) during November and December 2021. The last human B.1.1.7 virus in Ohio was collected on August 23, 2021 (EPI_ISL_3897556). The B.1.1.7 viruses collected from Ohio WTD cluster in two separate clades on the phylogenetic tree, consistent with two independent human-to-deer transmission events (Figures 2A and S2). The time-scaled MCC tree indicates that two human-to-deer transmission events occurred during the spring B.1.1.7 wave, an estimated 5-8 months prior to the detection of B.1.1.7 in WTD (Figures 2A, 2C, 2E). Together, these ndings suggest that two B.1.1.7 introductions from humans were followed by sustained deer-to-deer transmission in Ohio for more than ve months.
Delta variants frequently transmitted from humans to white-tailed deer.
The timing of the delta wave in humans ( Figures 2D and S3) coincided with the timing of the late autumn WTD hunting season in Ohio, leading to frequent introduction of delta variant viruses from humans to WTD. Twelve PANGO lineages considered delta variant (eleven AY lineages and B.1.617.2) were identi ed in WTD, representing a minimum of 12 independent human-to-deer delta transmissions ( Figure S3). Most AY lineages were introduced into WTD from humans more than once, as evidenced by multiple independent clades in the phylogenetic tree ( Figures 2B, S2, S4). In total, more than 30 delta reverse zoonosis events in Ohio were inferred across the branches of the MCC tree, based on "Markov jump" counts that infer the number of human-to-deer transmissions ( Figure 2E). The AY lineages detected at the highest frequency in WTD (AY.103 and AY.25) were also dominant in humans in Ohio ( Figures S5-S6). In general, the genetic composition of delta variants in Ohio WTD matched the dominant lineages found in humans at that time, with a positive association between AY lineage frequencies in humans and in WTD ( Figure S7). The MCC tree estimates that delta variants were introduced from humans to WTD during the autumn of 2021, coinciding with the delta wave in humans in Ohio, approximately 2-3 months before their descendants were detected in WTD by surveillance ( Figures 2D and 2E). Frequent reverse zoonotic transmission seeded high genetic diversity in Athens and Washington counties, where four SARS-CoV-2 lineages were identi ed during a single week (week 49, Figure S8). Each introduction transmitted onward in WTD enough to be picked up by surveillance, but not all introductions spread to multiple counties, and many are likely to be transient. Spatial diffusion of alpha and delta variants in white-tailed deer.
Onward deer-to-deer transmission was observed in 16 WTD clusters (14 delta and 2 alpha, Figure 3). Eleven clusters spanned multiple counties, including four that spanned three or more counties (Figures 3 and S9). The largest spatial clusters were found in rural areas. Delta clusters typically spanned neighboring counties that share a border, whereas alpha clusters included viruses from distant, non-contiguous counties separated by hundreds of kilometers. Longer distance dispersal of alpha variants is consistent with the longer time frame of alpha circulation in Ohio WTD inferred from the MCC tree (5-8 months, Figures 2B and 2C). At this level of sampling, every virus detected in a WTD, including 20 singletons positioned outside observed transmission clusters on long branches, likely belongs to a sampled or unsampled transmission cluster in deer.
SARS-CoV-2 evolves faster in white-tailed deer than in humans.
To explore whether WTD viruses' long branches ( Figure S2) and deviation from genetic distance root-to-tip regressions ( Figure  S10) arise because SARS-CoV-2 was evolving faster in WTD than in humans, we performed an in-depth analysis of all mutation events, scanned for positive and negative selection across the SARS-CoV-2 genome, and estimated rates of nucleotide substitution using a Bayesian approach. The estimated rate of SARS-CoV-2 evolution (substitutions per site per year) was approximately three times higher in WTD compared to humans for the alpha variant and 2.7 times higher for the delta variant (Figures 4A and S11, Table S5). The rate was higher in WTD for both synonymous and non-synonymous substitutions across all genome regions, including the spike protein ( Figure S12). However, the rate of non-synonymous substitution in WTD was not as high as would be expected, given the background rate of synonymous substitution. As a result, the ratio of non-synonymous to synonymous substitutions (dN/dS) was lower in WTD than in humans, across the genome, for both alpha and delta variants ( Figure S12). The dN/dS ratio for delta spike protein was signi cantly lower in WTD compared to humans (Figure S12), evidence of predominantly purifying selection that removes non-synonymous changes (Table S9). A high rate of non-synonymous substitution was observed in WTD for the genome partition that includes ORFs 3-8, the envelope, and membrane proteins. ORF3a had a high dN/dS ratio in WTD in an analysis of selected genes (See methods, Table S9), warranting a deeper examination of selection across a wider set of genes.
Mutational biases in white-tailed deer.
To investigate the possible causes of faster SARS-CoV-2 evolution in WTD compared to humans, we examined differences in mutational bias between human and deer branches in the phylogenetic tree. WTD had a signi cantly higher C-to-T mutational bias for both alpha and delta variants ( Figure 4C-H, Tables S6-S7). This pattern held for all mutations, including synonymous mutations, and could not be attributed to selection, low WTD sample size, or differences in codon usage between WTD and humans (Figures S13-S16). Most C>T mutations occurred in the WCW context (where W can be A or T), which has been linked to APOBEC activity in humans 22-24 , and APOBEC activity could increase C>T mutation biases in WTD.
Recurrent mutations in spike protein in WTD. Recurrent mutations arose independently in different WTD transmission clusters that could potentially represent individual sites under positive selection (Figures S17-S18). For example, the L18F mutation in the spike protein N-terminal domain occurred independently in six WTD delta clusters and two singletons ( Figure 4B; Table S8; Figure S17). This site was estimated to be subject to episodic diversifying positive selection (p-value 0.0001). The S:L18F mutation occurred during the Hong Kong hamster outbreak 13 and reduced antibody binding in the gamma variant in humans 25 . But the mutation fell to low frequencies (<1%) globally in delta and omicron ( Figure S19). Seven additional mutations were located on the spike protein surface, including N501Y in the receptor binding domain (RBD) (occurred twice) and H69Y. S:N501Y was a key mutation 26 in human variants (alpha, beta, gamma, omicron, mu), but was rarely observed in delta viruses in humans. No recurrent spike mutations were found in the smaller alpha dataset, but S:T29I was observed in both alpha and delta.
In vitro growth kinetics.
Virus isolation was attempted on 75 of the 163 PCR positive nasal swabs, yielding 27 SARS-CoV-2 isolates for further laboratory characterization. We analyzed the replication kinetics of nine representative WTD isolates (Table S10) Figure 5A). The α-BNT162b2 serum had detectable neutralizing titers against all Alpha and Delta lineage viruses tested except for the AY.118 isolate. Overall, some exceptions notwithstanding, most WTD viruses were antigenically similar to their human counterparts.

In vivo experiments.
A subset of WTD viruses was selected for analysis of viral replication and pathogenesis in Syrian Golden hamsters, an established animal model for SARS-CoV-2 variants 27-29 . Throughout the infection in unvaccinated animals, there was no substantial difference in the weight loss observed between the animals infected with different viruses ( Figure 5B). In each case, prior immunization with BNT162b2 led to decreased weight loss, suggesting that current vaccines would be expected to provide protection from severe disease if these viruses were to spill back into the human population ( Figure 5B). No differences were observed between the Hu-B.1.617.2 and the AY.3 or AY.25 nasal washes from unvaccinated hamsters. The Hu-B.1.1.7 virus showed a signi cantly higher titer at 2 dpi compared to the B.1.1.7-like strain, but that difference was not observed for the later timepoints ( Figure 5C). Vaccination resulted in a reduction in viral titers compared to unvaccinated hamsters for the majority of strains at 2 dpi and was most pronounced at 4 dpi ( Figure 5E, 5D).

Discussion
Three years into the COVID-19 pandemic, gaps in our knowledge of the virus's broader ecology and evolution in non-human hosts impede efforts to fully resolve the pandemic's zoonotic origins and predict its evolutionary future. Here, by capturing the early phase of SARS-CoV-2 transmission in WTD, we observe how a exible generalist virus invaded a new host species without extensive adaptive evolution or phenotypic change. The persistence of WTD transmission clusters for 2-8 months provided data with enough resolution to robustly estimate that SARS-CoV-2 evolves about three times faster in WTD than in humans SARS-CoV-2 evolution in WTD using Bayesian approaches. The high rate of SARS-CoV-2 evolution, apparent lack of extensive adaptive evolution, and absence of phenotypic change all suggest that the virus is evolving under purifying selection and transmitting e ciently in WTD without (thus far) necessitating major evolutionary changes.
Other zoonotic viruses, such as in uenza A virus, evolve at different rates in humans and other mammalian species. Higher overall rates of in uenza A virus evolution in non-human hosts (e.g., swine and birds) 30 raise questions about how virus evolution is modulated by population turnover, host metabolism, the activity of RNA editing enzymes, in particular APOBEC, and other physiological and ecological factors. In uenza serves as a reminder of how pathogens that jump from humans to animals can establish reservoirs with different evolutionary trajectories that can ultimately have consequential effects on animal and human health 31 . While new, tter in uenza variants continuously replace and purge older diversity in humans, pigs sustain older human strains that shu e their genomes through reassortment and periodically spill over into humans, turkeys, and canines, causing localized outbreaks or global pandemics. It remains questionable whether the ecology and contact rates of free-ranging WTD can sustain an acute respiratory pathogen like SARS-CoV-2 over the long term without involving other host species or environmental persistence. The enigma of how SARS-CoV-2 transmits so frequently from humans to WTD re ects broad gaps in our understanding of SARS-CoV-2 ecology.
Our study demonstrates that SARS-CoV-2 viruses sustained transmission in free-ranging WTD for up to eight months in Ohio in 2021. First, humans seeded dozens of SARS-CoV-2 outbreaks in WTD, scattered across the state. Then, WTD movements spread some viruses regionally over hundreds of kilometers, both in rural areas and on the outskirts of metropolitan areas. The persistence and clustering of alpha and delta lineage SARS-CoV-2 we detected in Ohio WTD are consistent with recent results from New York 7 . To understand how SARS-CoV-2 transmits deer-to-deer between subpopulations, future efforts should integrate WTD movement data collected over many years by wildlife biologists tracking pathogens of concern for livestock, such as chronic wasting disease and tuberculosis 32,33 . Our ndings are broadly consistent with patterns observed in other WTD diseases, including higher infection odds in males that travel and contact other WTD during breeding season 34 . Higher infection odds in December could be a bump after peak mating season, which increases WTD contact rates and movements through early November. Rising delta cases in humans could also be a factor. The timing of our sampling coinciding with the fall hunting season seemed to play a large role in where we detected active outbreaks. All these hypotheses require quantitative testing with mathematical models built from empirical movement data and ner resolution, year-round virological surveillance.
The sheer frequency of human-to-deer spillover observed in Ohio -and North America -is consistent with a growing consensus that cross-species transmission occurs far more frequently than previously detected. The yet-to-be-determined interface for human-to-deer transmission events appears to exist across the entirety of Ohio, regardless of proximity to major metropolitan areas. We cannot exclude the possibility that all detected introductions are due to direct human contact, although it seems unlikely. Indirect environmental transmission through wastewater and stormwater 35,36 is possible, but the virus has never been successfully isolated from wastewater, let alone run-off. The scope of airborne transmission of SARS-CoV-2 in indoor and outdoor environments has been a source of lively discussion since the onset of the pandemic.
Experimental studies of SARS-CoV-2 transmission in WTD and other wildlife are needed but conducting such experiments in BSL-3 adds cost and logistical hurdles. Other wildlife or domestic species susceptible to SARS-CoV-2 infection may have contact with and facilitate transmission to WTD. Recent work in Virginia, USA, has identi ed high seroprevalence in species with shared human habitats including raccoons, squirrels, skunks, and white-footed and deer mice, in addition to viral shedding in a opossum 37 . These species could plausibly interface with humans and WTD. However, surveillance in wildlife is limited, which makes identi cation of other potential intermediate hosts di cult.
There are approximately 30 million WTD in the United States, and that population is increasingly in close contact with humans 38 . Although there is evidence of transmission of WTD origin SARS-CoV-2 to humans 4 , no substantial outbreaks of deer-origin SARS-CoV-2 in humans have been reported. Our experimental results indicate that, in general, the WTD lineage viruses have not undergone substantial enough antigenic change to be a risk for immune and vaccine escape in human populations. Going forward, we predict that most delta transmission clusters seeded in WTD will die off and fail to persist long-term, a common phenomenon in SARS-CoV-2 population genetics that has been documented in humans in intensively sequenced locations 39 . However, as we observed for alpha, a subset might persist in WTD. Data from 2022 will soon reveal how successfully omicron variants invaded North American WTD and whether co-circulation of delta, alpha, and omicron provided breeding grounds for new recombinants. SARS-CoV-2 is currently not considered an important risk for North American livestock, but the continued spread of the virus in WTD, humans, and other hosts could open new pathways for SARS-CoV-2 evolution that we are only beginning to uncover. 16.

Methods
Sample Collection. Previously, we collected nasal samples from WTD in northeastern Ohio focusing on metropolitan area parks 1 . In the present study, we expanded geographical study area by approximately 1000-fold to target the entire state of Ohio. We collected 1522 nasal swabs from WTD across 83 of Ohio's 88 counties from October 2021 to March 2022. Of those, 713 samples were opportunistically collected from hunter-harvested WTD during hunting season (November to December 2021) in 81 counties in Ohio, drawing primarily from rural areas. In addition to hunted WTD, we also collected 801 samples from WTD culled during deer population management programs from 9 counties in Ohio. We had 2 additional samples collected from roadkill WTD, and 6 nasal swabs that did not have su cient metadata to identify the manner of death. We had a large sample size discrepancy between adult (n=1,199) and juvenile WTD (n=297), which contributed to the reduction of power and may bias the estimates. This large difference in sample between ages is likely due to hunters' common preference to take larger deer.
Members of the public are permitted to hunt WTD in Ohio during an explicit timeframe each autumn (following WTD mating season). Hunter-harvested WTD samples were collected at locations where other cervid disease surveillance programs were being conducted concurrently. Samples from culled WTD were collected in partnership with each respective population management program at their individual deer processing sites. Culled WTD are typically baited at their respective reservations and managed land sites (where hunting by the general public is not permitted). Professional sharpshooters then harvest WTD from those sites to reduce WTD population in accordance with the deer management plan for that location. Nasal swabs were collected from free-ranging WTD during the six-month period from October 2021 to March 2022. Most samples were collected during November-December 2021, corresponding to the gun hunting season in Ohio ( Figure S1). Sample collectors from all partner organizations were trained on standard sample collection methods. Collectors wore a facemask and changed gloves between all samples. Two sterile polyester tipped swabs were used during nasal swab collection. The rst swab cleared any debris from the exterior of the nostrils. A second sterile swab was inserted fully, scraped epithelial cells and nasal uids of both nasal passageways and placed into a tube containing 3mL viral transport media (BD UVT cat #220220). Post-collection, nasal swab samples were chilled temporarily in the eld until transport to our laboratory at The Ohio State University and stored at -80°C until diagnostic testing. We also collected blood samples to test for antibodies against SARS-CoV-2 in collaboration with USDA. We collected blood samples from 1164 of the 1522 WTD from which we collected nasal swabs. Blood was collected using 2 high purity cellulose ber lter paper (Nobuto strips) dipped into pooled blood from WTD carcasses to saturate the paper with approximately 0.1mL of blood. Nobuto strips were labeled and dried prior to transport.
Any carcasses that did not appear fresh, had already been substantially processed, exterior blood appeared heavily contaminated, or did not have visible blood sources available for collection were excluded. Due to post-mortem sample collection, the study was exempt from a scienti c permit from the Ohio Department of Natural Resources and beyond the scope of The Ohio State University Institutional Animal Care and Use Committee. Diagnostic Testing. Viral RNA extraction was conducted using the Omega Bio-tek Mag-Bind Viral DNA/RNA 96 kit (cat# M6246·03), with 200µl of sample 1 . Extracted viral RNA from samples was tested via real-time reverse transcription polymerase chain reaction (rRT-PCR) using the E gene primer/probe panel (Integrated DNA Technologies, Inc. cat #1006804) with Xeno VIC Internal Control Assay (Life Technologies cat #A29765) as previously described 1 . Any sample with a cycle threshold (Ct) value of 40 or below were considered E assay positive. All samples that screened E assay positive were con rmed using the Charité/Berlin RdRp con rmatory assay 2 (Integrated DNA Technologies, Inc. cat #10006805) or the CDC N1/N2 kit (Integrated DNA Technologies, Inc. cat #10006606) diagnostic procedure. Samples with a Ct of ≤40 on either con rmatory assay were classi ed as positive for SARS-COV-2.
Blood samples were tested for evidence of prior exposure to SARS-CoV-2 as previously described with minor modi cations 3 .
Brie y, antibody elution from Nobuto strips was accomplished by incubating each strip in 1 mL BupH Tris-buffered saline (pH Data visualization. Maps were generated using ArcMap (ESRI). For statistical analysis, location was coded using USDA Rural Urban Continuum Codes 5 . Codes range on a scale of 1-9 with increasing numbers corresponding to decreasing population size. For analysis purposes, an RUCC of 1 was considered urban -corresponding to counties surrounding Ohio's three major metropolitan areas of Cincinnati, Cleveland, and Columbus. All other RUCC values (two through nine) were considered rural for analysis, but smaller metropolitan areas that fall into this category are indicated to aid in understanding spatial clustering. To evaluate risk factors associated with detection of SARS-CoV-2, we t a mixed-effects logistic regression model (STATA 14.2, StataCorp LLC). We expected our data to violate the assumption of independence for logistic regression based on the spatial clustering of infectious disease outbreaks along with what we observed in our SARS-CoV-2 genomic sequences and therefore included random intercepts for county of sample collection to account for this clustering. A highly signi cant (p-value < 0.00005) likelihood ratio test compared to a standard logistic model supported that the mixed-effects model was better t for our data. Fixed effects were estimated for urban vs rural binary classi cation for the county, culled vs hunted WTD, WTD sex, WTD age, and a categorical effect for the month of sample collection to evaluate any changes over the course of the season (Table S3).
Phylogenetic analysis. First, to determine how viruses obtained from WTD in Ohio (n = 80) were genetically related to SARS-CoV-2 viruses circulating in humans in Ohio and mink and WTD in other North American locations, a background dataset of complete genome sequences was compiled from GISAID (downloaded May 24, 2022). A total of 44,456 human SARS-CoV-2 viruses, collected in Ohio during February 2, 2020 -May 10, 2022, were downloaded from GISAID. This study bene ted from the availability of a large number of SARS-CoV-2 sequences generated by the Ohio Department of Health, the US Centers for Disease Control, and other laboratories. County-level data was provided by the Ohio State Department of Health, which con rmed that the data set was spatially representative, with the number of sequences available from an Ohio county proportional to the population size of the county ( Figure S21). Pangolin 6 was used to assign a lineage to each human virus and ten viruses from each Pango lineage were randomly selected for the nal background dataset using a customized Python script (n = 1,592 human viruses). Outbreaks of genetically similar viruses from mink in Canada or a US state (Michigan, Oregon, Utah, and Wisconsin) were downsampled to 5 viruses per day, resulting in a total of 140 North American mink viruses in the nal dataset. In addition, 145 viruses from WTD were included from Canada and 14 US states (Arkansas, Illinois, Iowa, Kansas, Maine, Massachusetts, Minnesota, New Jersey, New York, North Carolina, Oklahoma, Pennsylvania, Tennessee, and Virginia). The dataset was aligned using NextClade with Wuhan-Hu-1 as a reference. In-house python scripts were used to remove non-coding regions and mask sites that are known to be unreliable. A phylogenetic tree was inferred from this dataset using maximum-likelihood methods available in IQ-TREE version 1.6.12 with a GTR + G model of nucleotide substitution and 1,000 bootstrap replicates, using the high-performance computational capabilities of the Biowulf Linux cluster at the National We performed a time-scaled Bayesian analysis using the Markov chain Monte Carlo (MCMC) method available using the latest version of the BEAST 7 package available on GitHub (compiled on October 20, 2022), using GPUs available from the NIH Biowulf Linux cluster. A host-speci c local clock 8 was used to accommodate differences in the evolutionary rate between WTD and humans. Since WTD viruses were not monophyletic on the alpha or delta tree, owing to multiple independent human-to-deer transmission events, separate WTD transmission clusters identi ed on the ML tree were speci ed. The analysis was performed two ways, excluding WTD singleton viruses that are not positioned in a WTD transmission cluster and including WTD singleton viruses. A Bayesian non-parametric demographic model 9 was used, with a general-time reversible (GTR) model of nucleotide substitution with gamma-distributed rate variation among sites. The MCMC chain was run separately 3-5 times for each dataset using the BEAGLE 3 10 library to improve computational performance, until all parameters reached convergence, as assessed visually using Tracer v.1.7.2. At least 10% of the chain was removed as burn-in, and runs for the same dataset were combined using LogCombiner v1.10.4. A MCC tree was summarized using TreeAnnotator v.1.10.4. To compare evolutionary rates across different regions of the SARS-CoV-2 genome, the analysis was repeated using ve genome partitions: ORF1a, ORF1b, ORF3-ORF8, spike (S), and nucleoprotein (N). Several additional analyses were performed, including a phylogeographic discrete trait analysis 11 to quantify rates of viral gene ow, particularly in the directions of human-to-deer and long-distance (across Ohio county lines) deer-to-deer transmission. A location state was speci ed for each viral sequence. All human viruses were categorized as "human," whereas WTD viruses were also categorized by location of collection, with state information used for WTD viruses collected outside of Ohio and county information provided for all Ohio WTD viruses collected for this study. The expected number of location state transitions in the ancestral history conditional on the data observed at the tree tips was estimated using 'Markov jump' counts 12,13 , which provided a quantitative measure of asymmetry in gene ow between de ned populations. To estimate absolute rates of synonymous and non-synonymous substitutions as well as dN/dS, we employ a 'renaissance counting' procedure that combines Markov jump counting with empirical Bayes modelling 14 . The outputs of these analyses (Markov jump counts, evolutionary rate distributions) were summarized and visualized using customized R scripts. Finally, we performed a exible random-effects analysis of the evolutionary substitution process 15 to capture mutational bias differences along the human and WTD branches of evolutionary history. Mutational bias was measured by deviations from the Hasegawa-Kishino-Yano (HKY) substitution model that accommodates unequal base frequencies and different rates of transition and transversion substitutions. To facilitate e cient sampling of the additional random-effects parameters, this analysis took advantage of gradient-based Hamiltonian Monte Carlo for phylogenetics within BEAST 16 .
Epidemiological data. The epidemiological curve of SARS-CoV-2 cases in humans in Ohio from January 1, 2021 to January 22, 2022 was generated using the number of daily reported COVID-19 cases in the state of Ohio (all age groups), available from the US Centers for Disease Control and Prevention (https://data.cdc.gov/Case-Surveillance/COVID-19-Case-Surveillance-Public-Use-Data-with-Ge/n8mc-b4w4). To estimate the proportion of COVID-19 cases belonging to different Pango lineages during each week of the epidemic, SARS-CoV-2 sequences collected from humans in Ohio during this time period were downloaded from GISAID. To account for inconsistencies in the intensity of viral surveillance, the number of viruses per lineage per week was normalized against the epidemiological curve derived from COVID-19 case counts and visualized using R. To further minimize biases only sequences categorized in the GISAID submission as obtained using a 'baseline surveillance' sampling strategy were included in the analysis. The dataset was further trimmed to include only submissions with complete collection dates and su cient coverage to assign a Pango lineage, resulting in a nal dataset of 27,187 sequences from Ohio. For simplicity, sub-lineages of B.1.617.2 (for example, AY.3) were consolidated into the Delta category, sub-lineages of B.1.1.7 (for example, Q.3) were consolidated into the Alpha category, and sub-lineages of B.1.1.529 (e.g., BA.1) were consolidated into the Omicron category.
Mutation analysis. We used root-to-tip regression to visualize the rate of substitution over time for the omicron variant, deer viruses, and all other SARS-CoV-2 lineages. To correctly assign mutation status during annotation to alignment in the phylogenetic analysis section, reference Wuhan genome (NC_045512.2) was added with MAFFT version 7.475 17 ; the Wuhan genome was then removed to preserve the original set of sequences, while allowing for the correct alignment length representing all positions in the SARS-CoV-2 genome. This alignment together with the corresponding phylogenetic tree (see Phylogenetic analysis section) were used to reconstruct states at all tree nodes with TreeTime ancestral v 0.9.0-b.2 using default parameters. Mutations were extracted from the tree and reformatted into vcf format. Obtained vcf les were annotated with SnpEff v 4.5, and NC_045512.2 was utilized as a reference. Because mutations that happen along the tree do not always have the same nucleotide in REF eld as position in genome, we corrected the annotation for the cases when those two did not match. The nal annotated vcf and phylogenetic tree were used to count the number of mutations occurring from root to each leaf. We considered all mutations, synonymous and missense independently. The obtained results were visualized in R. To calculate the linear regression slope we excluded all WTD samples and human omicron data.
To accurately analyze mutations accumulating in WTD on delta and alpha backgrounds, datasets described in the Bayesian analysis section were utilized. First, we added reference to each alignment as described above, and then reconstructed phylogenetic trees with IQTree using the following parameters: --polytomy -m GTR+G --alrt 1000. Root in both cases was placed at the reference genome. States at nodes were reconstructed as described above. VCF les were produced independently for WTD clusters and WTD singletons. In alpha dataset we utilized not only the two clusters from Ohio, but all available alpha clusters (6 in total, Figure S12). VCF annotation was produced as described above. Mutations in known problematic sites were ltered out using a list available at https://github.com/W-L/ProblematicSites_SARS-CoV2. Individual transmission clusters were visualized with ete3 python package 18 . Observed mutations on the spike trimer were visualized using the Protein Data Bank (PDB; rcsb.org), structure ID 7JJI. Structure visualization was performed with Open-Source PyMol version 2.4.0.
R-package MutationalPatterns was utilized to reconstruct mutational contexts. For input, we utilized mutations in WTD clusters and data on mutations in humans, the latter of which was extracted from the public version of the UShER (Ultrafast Sample placement on Existing tRee) tree downloaded on 2022-07-01 (http://hgdownload.soe.ucsc.edu/goldenPath/wuhCor1/UShER_SARS-CoV-2/2022/07/01/public-2022-07-01.all.masked.pb.gz), containing 5.7 million sequences. Variability in mutational contexts in humans was estimated by producing a series of subsamples containing the same number of mutations as observed in WTD delta clusters. The 10 subsamples were utilized to perform a permutation test to estimate signi cance of elevated C>T rate in WTD.
Selection analysis. The HyPhy package was used to study positive and negative selection 19 . For this analysis we selected the four genes with highest number of homoplastic sites in clusters (N, S, ORF3a and nsp3). Samples that contained missing data (Ns) in the studied gene were removed, because HyPhy is unable to perform calculations on the missing data. Three different methods were run for each gene: aBSREL and BUSTED to check for positive selection in particular genes in WTD, and MEME to look for individual sites under positive selection. We independently tested selection for two sets of branches: all branches within WTD transmission clusters (called later on 'clusters'); and clusters set plus singletons and the branches leading to transmission clusters (called 'all'). While the clusters set represented the mutations happening only within the WTD population, the 'all' set was potentially contaminated by mutations that happened before the virus was transmitted to WTD, but it allowed incorporation of all available WTD samples. To search for sites under positive selection with MEME, we only looked for sites in foreground branches (e.g. included in 'cluster' or 'all' sets) with p < 0.001 in comparison with all other branches on the phylogenetic tree (background). The analyses for alpha and delta datasets were performed independently on the phylogenetic trees described in the previous section. Foreground branches were marked on the phylogenetic tree with phylotree.js (http://veg.github.io/phylotree.js/#). dN/dS value provided in the Supplementary table S9 were extracted from MEME output. Results of aBSREL and BUSTED outputs showed no signs of gene wise positive selection. As an alternative approach, we used codeml (from PAML version 4.9e) to test whether there are signs of positive selection on branches leading for transmission clusters. Codeml was run in two modes with x_omega =1 and x_omega = 0. The LRT and p-values were calculated from obtained lnL values for each cluster. Again, no signs of gene wise positive selection were found. To study the frequencies of mutations in human populations for comparison to WTD, we utilized the outbreak.info package for R 20 that utilizes GISAID data. Data was visualized with ggplot2 package for R 4.  (Gibco 15240-062)). After 1 hour, the inoculum was aspirated off and fresh infection media was added to the cells. Cultures were checked daily, and media-cell suspension was harvested when greater than 90% cytopathic effect (CPE) was observed. The suspension was tested by BD Veritor System for rapid detection of SARS-CoV-2 (Catalog # 256082) for con rmation of virus isolation and streaked on blood agar plates for sterility.
TCID 50 assay. Virus stocks or experimental samples were titered using a VeroE6 cell line ectopically expressing the TMPRSS2 gene (Vero E6 T2) sourced from JCRB Cell Bank in Japan (https://cellbank.nibiohn.go.jp/english/). To determine the 50% tissue culture infectious dose (TCID 50 ), 96 well culture plates were inoculated with 100uL of a 1:10 serially diluted sample in infection media. After 72 hours the plates were xed and stained with 0.1% crystal violet in 10% formalin. Infectious dose titers were determined using the Reed and Muench method 21 .
Vaccination. Male LVG Golden Syrian Hamster, 4-5 weeks of age, were purchased from Charles River Laboratories (Wilmington, MA), assigned numbers sequentially upon arrival to the Animal Resource Center, and assigned to groups based on vaccine treatment and virus challenge on paper without investigators observing individual animals. There were 10 total experimental groups with 120 total hamsters. Hamsters assigned to vaccine groups were vaccinated intramuscularly with 10ug BNT162b2 vaccine (New York, NY), prepared as instructed with the exception that it had reached an expiration date and was no longer suitable for clinical use. Vaccine was administered in 50ul at 2 injection sites (100ul total volume) of the rear hind limb. Animals were boosted by the same procedure, on alternate limb, 21 days post vaccination (dpv). Sera was collected 21 days post boost (B+21) and assessed for antibody response by microneutralization.
Animal challenge. Approximately 3 weeks post boost, vaccinated or naïve control hamsters were inoculated intranasally with 10 4 TCID 50 of SARS-Cov-2 virus. Each experimental group had 5 animals that were used for weight loss measurements, 4 animals that were sacri ced on 2-days post inoculation (dpi), and 3-4 animals that were sacri ced on 4 dpi for a total of 12-13 animals per group. Except the Human B.1.1.7 and Deer AY.25, which had 4, 3, and 3 animals for each of those timepoints respectively (10 total per group). These groups were limited by animal availability and included an unvaccinated control group for cross virus comparisons only ( Figure 5B). At 2 and 4 dpi, animals were sacri ced for lung and nasal turbinate. Alternatively, on 2, 4 and 6 dpi animals were anesthetized with 100mg/kg Ketamine and nasal passages were rinsed with 0.5mL phosphate buffered saline (PBS). Infectious viral load was determined by TCID 50 as described. Longitudinal animals were scored for clinical signs and weighed daily for 2 weeks. All surviving animals were exsanguinated at 21 dpi and the serum was collected for microneutralization comparison. Animals were housed individually in standard lter top rat cages with day/night cycle from 6am-6pm. Animal health observations were made at least 1/day or 2/day during peak infection. We did not observe any adverse events in these experiments other than the expected animal weight loss. Fivefold serial dilutions were performed on heat inactivated sera (1 hour at 56°C) in infection medium, starting at a 1:40 dilution. A standardized amount of infectious SARS-CoV-2 virus (250 TCID 50 ), diluted in infection medium, was added to the diluted serum at a 1:1 ratio and incubated for 1 hour at 37°C. A volume of 100uL of the serum/virus mixture was added to Vero E6 T2 cells seeded in 96-well plates the previous day and incubated for 1 hour at 37°C under 5% CO2. Subsequently, an additional 100uL of infection media was added and the cells incubated for a further 24 -48 hours. Following incubation, cells were xed with 4% formaldehyde (Polysciences Cat #18814-20) for 30 mins, washed with PBS (source) three times and then incubated with a block/permeabilization buffer (PBS supplemented with 3% Bovine Serum Albumin (BSA; Sigma-Aldrich Cat #A8327-500ml) and 0.2% Triton-X-100 (ThermoFisherSurfact-Amps-X-100, 10% Solution Cat #28314)) for 30 minutes. Rabbit anti-SARS CoV-2 NP mAb (Sinobiologicals Cat # 40143-R040) at a 1:2000 dilution was added for 1 hour. Cells were washed three times with PBS supplemented with 0.5% Tween (PBST; Thermo sher Cat #28314) before incubation with a secondary goat anti-rabbit IgG -HRP conjugated antibody (Cell Signaling Cat# 7074S)) at a 1:3000 dilution for 1 hour. After washing the cells three times with PBST, 100uL of TMB (Thermo sher Cat #N301) was added and color developed for 10 mins before 1N sulfuric acid (Fisher Scienti c Cat #SA212-1) was added to stop the reaction. The optical density was measured at 450nm on a Biotek Synergy plate microplate reader and the neutralization titers were calculated as the reciprocal serum dilution (IC 50 ) causing 50% reduction of relative light units.  Writing -review & editing: AFM, ,DH, AMW, SAF, AK, ST, LM, TJ, AP, JMN, MJT, JTG, AJM, KK, TJL, SNB, JBL, JCC, TJD, EVK (Table S2). Counties marked with an asterisk indicate counties from which samples were collected from culled WTD as a part of population management programs (Table S1).  Map of SARS-CoV-2 transmission clusters in Ohio white-tailed deer. Each shape represents a county in Ohio where SARS-CoV-2 virus was identi ed in WTD for this study (triangle = alpha variant; circle = delta variant). Large circles indicate WTD transmission clusters, as identi ed on the phylogenetic tree (black = clusters restricted to one county; shaded = clusters identi ed in more than one county). Large circles shaded the same color belong to the same transmission cluster. Small black circles indicate singleton WTD viruses. PANGO lineage provided for all clusters. Human population density is shown in the background (red = high; green = low) and major cities are labeled. Alpha results (similar) are provided in Figure S11. (B) Mutations in spike protein that were found in delta WTD clusters (orange), alpha WTD clusters (green), and both alpha and delta WTD clusters (yellow). All recurrent mutations from WTD clusters are documented in Table S8. The log deviation (random-effect) from HKY model relative rates is presented for (C) alpha, humans, (D) alpha, WTD, (E) alpha, WTD-to-human ratio, (F) delta, humans, (G) delta, WTD, and (H) delta, WTD-tohuman ratio. Box midlines indicate the median, the box limits show the upper and lower quartiles, and the whiskers extend to 1.5 times the interquartile range. Asterisks indicate transversions. WTD-to-human ratios that signi cantly differ from zero are highlighted.

Figure 5
Pathogenicity and replication of multiple strains of SARS-CoV-2 viruses in Golden Syrian hamsters. (A) Microneutralization titers of a-BNT162b2 or lineage speci c serum against representative viruses from this study. For B-E, the mean for each group is plotted, and bars indicate standard deviation. Titers expressed as log 10 EC 50 were plotted and described as a fold change from the reference strains. (B) Body weight loss comparison between unvaccinated and BNT162b2 vaccinated animals at the peak of infection, day 7. Mean weights are displayed as a percentage of starting weight. Nasal wash was collected (unvaccinated groups only) (C) or lung and nasal turbinate were harvested (D and E) and used to quantify viral titers.

Supplementary Files
This is a list of supplementary les associated with this preprint. Click to download.